FractalNet: Ultra-Deep Neural Networks without Residuals
Authors
Gustav Larsson, Michael Maire, Gregory Shakhnarovich
Abstract
We introduce a design strategy for neural network macro-architecture based on self-similarity. Repeated application of a single expansion rule generates an extremely deep network whose structural layout is precisely a truncated fractal. Such a network contains interacting subpaths of different lengths, but does not include any pass-through connections: every internal signal is transformed by a filter and nonlinearity before being seen by subsequent layers. This property stands in stark contrast to the current approach of explicitly structuring very deep networks so that training is a residual learning problem. Our experiments demonstrate that residual representation is not fundamental to the success of extremely deep convolutional neural networks. A fractal design achieves an error rate of 22.85% on CIFAR-100, matching the state-of-the-art held by residual networks. Fractal networks exhibit intriguing properties beyond their high performance. They can be regarded as a computationally efficient implicit union of subnetworks of every depth. We explore consequences for training, touching upon a connection with student-teacher behavior, and, most importantly, demonstrating the ability to extract high-performance fixed-depth subnetworks. To facilitate this latter task, we develop drop-path, a natural extension of dropout, to regularize co-adaptation of subpaths in fractal architectures. With such regularization, fractal networks exhibit an anytime property: shallow subnetworks provide a quick answer, while deeper subnetworks, with higher latency, provide a more accurate answer.
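The expansion rule and the drop-path regularizer described in the abstract can be illustrated concretely. The following is a minimal PyTorch sketch, not the authors' released code: the base case f_1(z) is a single convolution, the recursion f_{C+1}(z) joins two stacked copies of f_C with a fresh convolutional column via an element-wise mean, and the join applies a simple local drop-path during training. Names such as FractalBlock, conv_bn_relu, and the drop_prob value are illustrative assumptions.

# A minimal sketch (not the authors' code) of the fractal expansion rule and
# local drop-path, assuming PyTorch. FractalBlock, conv_bn_relu, and drop_prob
# are illustrative names, not taken from the paper.
import torch
import torch.nn as nn


def conv_bn_relu(in_ch, out_ch):
    """One 'layer' in the fractal rule: conv -> batch-norm -> ReLU."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )


class FractalBlock(nn.Module):
    """f_1(z) = conv(z);  f_{C+1}(z) = join(f_C(f_C(z)), conv(z))."""

    def __init__(self, in_ch, out_ch, num_columns, drop_prob=0.15):
        super().__init__()
        self.drop_prob = drop_prob
        self.shortcut = conv_bn_relu(in_ch, out_ch)  # the new, shallow column
        if num_columns > 1:
            # Two stacked copies of the (C-1)-column sub-fractal form the deep path.
            self.deep = nn.Sequential(
                FractalBlock(in_ch, out_ch, num_columns - 1, drop_prob),
                FractalBlock(out_ch, out_ch, num_columns - 1, drop_prob),
            )
        else:
            self.deep = None

    def forward(self, x):
        paths = [self.shortcut(x)]
        if self.deep is not None:
            paths.append(self.deep(x))
        return self._join(paths)

    def _join(self, paths):
        # Local drop-path: randomly drop inputs to the join but keep at least one,
        # then average the survivors (the element-wise mean join).
        if self.training and len(paths) > 1:
            keep = [p for p in paths if torch.rand(1).item() > self.drop_prob]
            if not keep:
                keep = [paths[torch.randint(len(paths), (1,)).item()]]
            paths = keep
        return torch.stack(paths, dim=0).mean(dim=0)

As a usage example, FractalBlock(3, 64, num_columns=4) builds a four-column block whose longest path stacks 2^3 = 8 convolutions; at evaluation time no paths are dropped and the join simply averages all columns.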
Similar resources
Deep Convolutional Neural Network Design Patterns
Recent research in the deep learning field has produced a plethora of new architectures. At the same time, a growing number of groups are applying deep learning to new applications. Some of these groups are likely to be composed of inexperienced deep learning practitioners who are baffled by the dizzying array of architecture choices and therefore opt to use an older architecture (i.e., Alexnet...
SMASH: One-Shot Model Architecture Search through HyperNetworks
Designing architectures for deep neural networks requires expert knowledge and substantial computation time. We propose a technique to accelerate architecture selection by learning an auxiliary HyperNet that generates the weights of a main model conditioned on that model’s architecture. By comparing the relative validation performance of networks with HyperNet-generated weights, we can effectiv...
Exploring the Depths of Recurrent Neural Networks with Stochastic Residual Learning
Recent advancements in feed-forward convolutional neural network architecture have unlocked the ability to effectively use ultra-deep neural networks with hundreds of layers. However, with a couple exceptions, these advancements have mostly been confined to the world of feed-forward convolutional neural networks for image recognition, and NLP tasks requiring recurrent networks have largely been...
Handwritten Bangla Character Recognition Using The State-of-Art Deep Convolutional Neural Networks
In spite of advances in object recognition technology, Handwritten Bangla Character Recognition (HBCR) remains largely unsolved due to the presence of many ambiguous handwritten characters and excessively cursive Bangla handwritings. Even the best existing recognizers do not lead to satisfactory performance for practical applications related to Bangla character recognition and have much lower p...
Deep neural networks for time series prediction with applications in ultra-short-term wind forecasting
The aim of this paper is to present deep neural network architectures and algorithms and explore their use in time series prediction. Existing and novel input variable selection algorithms and deep neural networks are applied for ultra-short-term wind prediction. Since gradient-based optimization starting from random initialization often appears to get stuck in poor solutions, recent research e...
Journal: CoRR
Volume: abs/1605.07648
Pages: -
Publication year: 2016